4 research outputs found

    Topic Distiller: distilling semantic topics from documents

    Abstract. This thesis details the design and implementation of a system that finds relevant and latent semantic topics in text documents. The design of the system, named Topic Distiller, is inspired by research on automatic keyphrase extraction and automatic topic labeling, and it employs entity linking and knowledge bases to reduce text documents to their semantic topics. The Topic Distiller is evaluated using methods and datasets from information retrieval and automatic keyphrase extraction. In addition to the common datasets used in the literature, three new datasets were created to evaluate the system. The evaluation shows that the Topic Distiller is able to find relevant and latent topics in text documents, outperforming state-of-the-art automatic keyphrase extraction methods on news articles and social media posts. Finnish title: Semanttisten aiheiden suodattaminen dokumenteista.

    Symbolic computation of filter sensitivity factors (Suodattimien herkkyyskertoimien laskeminen symbolisesti)

    This study addresses the symbolic and numeric computation of filter sensitivity factors. It outlines how the sensitivity factors can be derived from a filter's transfer function using the mathematics software Maxima, and how they can be simulated from a circuit schematic using the LTSpice circuit simulation software, which has no built-in sensitivity analysis. The filters studied are Sallen-Key type I, II and III filters.
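For orientation, the block below states the classical relative sensitivity of a quantity $y$ to a component value $x$, followed by one worked case. Both are assumptions made for illustration: the abstract does not say which definition or which circuit equations the study uses, and the example takes the textbook unity-gain Sallen-Key low-pass filter, which may or may not correspond to one of the study's types I to III.

```latex
% Relative sensitivity of y with respect to x (assumed definition)
S_x^{y} = \frac{x}{y}\,\frac{\partial y}{\partial x}
        = \frac{\partial \ln y}{\partial \ln x}

% Worked case: unity-gain Sallen-Key low-pass, natural frequency
\omega_0 = \frac{1}{\sqrt{R_1 R_2 C_1 C_2}}
         = \left(R_1 R_2 C_1 C_2\right)^{-1/2}

% Sensitivity of \omega_0 to R_1
S_{R_1}^{\omega_0}
  = \frac{R_1}{\omega_0}\,\frac{\partial \omega_0}{\partial R_1}
  = \frac{R_1}{\omega_0}\left(-\frac{1}{2}\,\frac{\omega_0}{R_1}\right)
  = -\frac{1}{2}
```

Under these assumptions a 1 % increase in $R_1$ lowers $\omega_0$ by about 0.5 %, and by symmetry the same value holds for $R_2$, $C_1$ and $C_2$.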

    Catchem: a browser plugin for the Panama papers using approximate string matching

    Abstract. The Panama Papers is a collection of 11.5 million leaked records that contain information on more than 214,488 offshore entities. The collection is growing rapidly as more leaked records become available online. In this paper, we present work in progress on a web browser plugin that detects company names from the Panama Papers and alerts the user by means of unobtrusive visual cues. We matched a random sample of company names from the Public Works and Government Services Canada registry against the Panama Papers using three different string matching techniques. Monge-Elkan is found to provide the best matching results, but at increased computational cost. A Levenshtein-based approach is found to provide the best tradeoff between matching quality and computational cost, while a Jaccard-index-like approach is found to be less sensitive to slight textual changes.
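The three families of string matching named in this abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the paper's implementation: the function names, the whitespace tokenization, the normalization of edit distance to a 0 to 1 similarity, and the use of Levenshtein similarity as the inner measure of Monge-Elkan are all assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: fewest single-character insertions, deletions, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1]; 1.0 means identical (assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard index over lowercase word sets (assumed tokenization)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def monge_elkan(a: str, b: str) -> float:
    """Average, over the tokens of a, of the best inner similarity to any token of b."""
    tokens_a, tokens_b = a.lower().split(), b.lower().split()
    if not tokens_a or not tokens_b:
        return 0.0
    best = [max(levenshtein_similarity(x, y) for y in tokens_b) for x in tokens_a]
    return sum(best) / len(best)


if __name__ == "__main__":
    # Hypothetical company names, used only for illustration.
    left, right = "Acme Holdings Ltd", "Acme Holdings Limited"
    print(levenshtein_similarity(left.lower(), right.lower()))
    print(jaccard_similarity(left, right))
    print(monge_elkan(left, right))
```

Monge-Elkan runs the inner measure once per token pair, which is consistent with the higher computational cost the abstract reports for it. Note that this form of Monge-Elkan is not symmetric in its arguments.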

    A comprehensive model for measuring real-life cost-effectiveness in eyecare: automation in care and evaluation of system (aces-rwm™)

    Abstract. This paper describes a holistic, yet simple and comprehensible, ecosystem model for dealing with multiple and complex challenges in eyecare. It aims to produce the best possible wellbeing and eyesight with the available resources. To improve real-world cost-effectiveness, what gets done in everyday practice needs to be measured routinely, efficiently and unselectively. Collecting all real-world data from all patients will enable eyecare systems and departments to be evaluated and compared with one another, nationally and internationally. The concept advocates a strategy to optimize the real-life effectiveness, sustainability and outcomes of service delivery in ophthalmology. The model consists of three components: (1) resource-governing principles (i.e., to deal with increasing demand and limited resources), (2) real-world monitoring (i.e., to collect structured real-world data utilizing automation and visualization of clinical parameters, health-related quality of life and costs), and (3) digital innovation strategy (i.e., to evaluate and benchmark real-world outcomes and cost-effectiveness). The core value and strength of the model lie in the consensus and collaboration of all Finnish university eye clinics to collect and evaluate uniformly structured real-world outcomes data. In addition to ophthalmology, the approach is adaptable to any medical discipline to efficiently generate real-world insights and resilience in health systems.